Do not self-allocate IDs in Beam by default. #1809

jianglai · 2022-10-05T20:16:00Z

Per b/250948425, it is dangerous to implicitly allow all Beam pipelines
to create buildables by self-allocating the IDs. This change makes it so
that one has to explicitly opt in to self-allocated IDs in Beam.

A boolean is added to the pipeline options so that it can be passed to
the Beam worker initializer that controls the behavior of the JVM on
each worker. Note that we did not add the option to the metadata.json file
because, given the risk, we did not want people to use the override at run
time when launching a pipeline. As shown in RdePipeline.java, we instead
explicitly hard-code the option in the pipeline. Nothing stops one from
supplying that option when launching the pipeline, but it is not
advised.
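As a rough illustration of the opt-in flow described above, here is a minimal sketch in plain Java. It does not depend on Beam; `SketchOptions`, `SketchIdService`, and `SketchWorkerInitializer` are hypothetical stand-ins for the real pipeline options, `IdService`, and `RegistryPipelineWorkerInitializer`, and only the names `getUseSelfAllocatedId` and `useSelfAllocatedId` come from this change.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the pipeline options (the real one would be a Beam options interface). */
class SketchOptions {
  // Off by default, so a pipeline has to opt in explicitly.
  private boolean useSelfAllocatedId = false;

  boolean getUseSelfAllocatedId() {
    return useSelfAllocatedId;
  }

  void setUseSelfAllocatedId(boolean value) {
    useSelfAllocatedId = value;
  }
}

/** Hypothetical stand-in for IdService, reduced to the opt-in switch. */
class SketchIdService {
  private static final AtomicLong nextId = new AtomicLong(1);
  private static boolean selfAllocated = false;

  static void useSelfAllocatedId() {
    selfAllocated = true;
  }

  static long allocateId() {
    if (!selfAllocated) {
      // Assumption for this sketch: fail loudly instead of silently self-allocating.
      throw new IllegalStateException("Self-allocated IDs were not explicitly enabled");
    }
    return nextId.getAndIncrement();
  }
}

/** Hypothetical stand-in for the worker initializer that runs once per worker JVM. */
class SketchWorkerInitializer {
  static void beforeProcessing(SketchOptions options) {
    if (options.getUseSelfAllocatedId()) {
      SketchIdService.useSelfAllocatedId();
    }
  }
}
```

A pipeline that really needs self-allocated IDs would call `setUseSelfAllocatedId(true)` in its own code, which is the hard-coding approach the description attributes to RdePipeline.java.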

Tested=deployed the pipeline to alpha and ran it.

weiminyu

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @CydeWeys and @jianglai)

core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

    // as you can build the entities.
    if (registryOptions.getUseSelfAllocatedId()) {
      IdService.useSelfAllocatedId();

Instead of using a flag, maybe we can provide a bring-your-own-id-service API.

Open up IdService to accept a user-provided Supplier if the environment is BEAM,
and do not provide a default implementation (throw if on Beam and no supplier is provided).

Code quote:

    if (registryOptions.getUseSelfAllocatedId()) {
      IdService.useSelfAllocatedId();
    }

weiminyu

Reviewable status: 0 of 4 files reviewed, 1 unresolved discussion (waiting on @CydeWeys and @jianglai)

core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

Instead of using a flag, maybe we can provide a bring-your-own-id-service API.

Open up IdService to accept a user-provided Supplier if the environment is BEAM,
and do not provide a default implementation (throw if on Beam and no supplier is provided).

This way we can plug in a raw datastore api to allocate ids.
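One way to read this suggestion is sketched below. It is only an illustration under stated assumptions: `SketchEnvironment` and `PluggableIdService` are hypothetical names, the real `IdService` uses static methods rather than instances, and the defaults that non-Beam environments would have are left out.

```java
import java.util.Objects;
import java.util.function.Supplier;

/** Hypothetical stand-in for the runtime environment. */
enum SketchEnvironment {
  UNITTEST,
  BEAM,
  PRODUCTION
}

/** Hypothetical ID service that has no default supplier on Beam. */
final class PluggableIdService {
  private final SketchEnvironment environment;
  // Stays null until a caller provides a supplier; this sketch models only the Beam path.
  private Supplier<Long> idSupplier;

  PluggableIdService(SketchEnvironment environment) {
    this.environment = environment;
  }

  /** Accepts a user-provided supplier, but only when running on Beam. */
  void setIdSupplier(Supplier<Long> supplier) {
    if (environment != SketchEnvironment.BEAM) {
      throw new IllegalStateException("ID supplier overrides are only allowed on Beam");
    }
    this.idSupplier = Objects.requireNonNull(supplier, "supplier");
  }

  /** Throws if no supplier was provided, instead of falling back to a default. */
  long allocateId() {
    if (idSupplier == null) {
      throw new IllegalStateException("No ID supplier was provided");
    }
    return idSupplier.get();
  }
}
```

With this shape, a raw Datastore allocator could be passed in as just another `Supplier<Long>`, and a pipeline that forgets to provide one fails on the first allocation rather than producing IDs silently.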

CydeWeys

Reviewable status: 0 of 4 files reviewed, 2 unresolved discussions (waiting on @CydeWeys, @jianglai, and @weiminyu)

core/src/main/java/google/registry/model/IdService.java line 67 at r2 (raw file):

   * @see #isSelfAllocated
   */
  public static void useSelfAllocatedId() {

Can we use different verbiage for this? E.g. "fake id", "testing id", "fake testing id" or similar? I don't think "self-allocated ID" conveys the right meaning, namely, that this is horribly unsafe for anything in production.

jianglai

PTAL.

Reviewed 4 of 4 files at r1, 2 of 2 files at r2, 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @CydeWeys and @weiminyu)

core/src/main/java/google/registry/beam/common/RegistryPipelineWorkerInitializer.java line 78 at r2 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

This way we can plug in a raw datastore api to allocate ids.

Done.

core/src/main/java/google/registry/model/IdService.java line 67 at r2 (raw file):

Previously, CydeWeys (Ben McIlwain) wrote…

Can we use different verbiage for this? E.g. "fake id", "testing id", "fake testing id" or similar? I don't think "self-allocated ID" conveys the right meaning, namely, that this is horribly unsafe for anything in production.

Reworked the class to take an ID supplier per Weimin's suggestion. I think the supplier used in tests can still be called SelfAllocatedIdSupplier, as that is an accurate description of what it does, whereas FakeIdSupplier does not convey exactly what kind of "fake" it is, e.g. whether it is a constant (which should probably be named ConstantIdSupplier) or random (RandomIdSupplier).

weiminyu · 2022-10-17T19:14:15Z

core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

  private static Supplier<Long> idSupplier =
      RegistryEnvironment.UNITTEST.equals(RegistryEnvironment.get())
          ? SelfAllocatedIdSupplier.getInstance()

Building on Ben's comment, maybe we should call this class NonUniqueIdSupplier, and
make the useSelfAllocatedId BEAM flag an enum with NonUniqueId, OfyId, and maybe RawDatastoreId if it comes to that.

Code quote:

SelfAllocatedIdSupplier

jianglai

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @CydeWeys and @weiminyu)

core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

Previously, weiminyu (Weimin Yu) wrote…

Building on Ben's comment, maybe we should call this class NonUniqueIdSupplier, and
make the useSelfAllocatedId BEAM flag an enum with NonUniqueId, OfyId, and maybe RawDatastoreId if it comes to that.

Just to clarify, do you mean having a lookup map from the Beam enums to ID suppliers that one uses to determine the override?
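For concreteness, the lookup map being asked about could look roughly like the sketch below. The enum constants mirror the names proposed in the thread, but `IdSupplierKind` and `IdSupplierRegistry` are hypothetical, and only the non-unique supplier is registered because the other two would need a Datastore connection.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical enum that would replace the boolean Beam flag. */
enum IdSupplierKind {
  NON_UNIQUE_ID,
  OFY_ID,
  RAW_DATASTORE_ID
}

/** Hypothetical lookup from the enum value to the ID supplier it selects. */
final class IdSupplierRegistry {
  private static final Map<IdSupplierKind, Supplier<Long>> SUPPLIERS =
      new EnumMap<>(IdSupplierKind.class);

  static {
    // An in-memory counter: IDs are unique within one JVM only, hence "non-unique".
    AtomicLong counter = new AtomicLong(1);
    SUPPLIERS.put(IdSupplierKind.NON_UNIQUE_ID, counter::getAndIncrement);
    // OFY_ID and RAW_DATASTORE_ID are deliberately left unregistered in this sketch.
  }

  private IdSupplierRegistry() {}

  /** Returns the supplier for the given kind, or throws if none is registered. */
  static Supplier<Long> forKind(IdSupplierKind kind) {
    Supplier<Long> supplier = SUPPLIERS.get(kind);
    if (supplier == null) {
      throw new IllegalArgumentException("No ID supplier registered for " + kind);
    }
    return supplier;
  }
}
```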

jianglai

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @CydeWeys and @weiminyu)

core/src/main/java/google/registry/model/IdService.java line 48 at r3 (raw file):

Previously, jianglai (Lai Jiang) wrote…

Just to clarify, do you mean having a lookup map from the Beam enums to ID suppliers that one uses to determine the override?

Also, I think the current iteration is explicit enough to discourage the use of ID supplier overrides. It also allows for the further migration to a SQL-provided ID (as another ID provider), by which time we can simply remove the setter, as Beam has access to SQL. It doesn't seem worth it to make the Beam flags more flexible as it would just be throwaway work.

CydeWeys

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @weiminyu)

Per b/250948425, it is dangerous to implicitly allow all Beam pipelines to create buildables by self-allocating the IDs. This change makes it so that one has to explicitly request self-allocation in Beam. The boolean is added to the pipeline options so that it can be passed to the Beam worker initializer that controls the behavior of the JVM on each worker. Note that we did not add the option to the metadata.json file because, given the risk, we did not want people to use the override at run time when launching a pipeline. As shown in RdePipeline.java, we instead explicitly hard-code the option in the pipeline. Nothing stops one from supplying that option when launching the pipeline, but it is not advised. Tested=deployed the pipeline to alpha and ran it.

jianglai

Reviewed 1 of 2 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @weiminyu)

Approval obtained.

jianglai added the WIP Work in progress. Don't review yet. label Oct 5, 2022

jianglai force-pushed the allocate-id branch from 27b46f5 to c128b97 Compare October 5, 2022 22:37

jianglai removed the WIP Work in progress. Don't review yet. label Oct 5, 2022

jianglai requested review from CydeWeys and weiminyu October 5, 2022 22:38

jianglai force-pushed the allocate-id branch from c128b97 to a866ab8 Compare October 5, 2022 22:38

weiminyu requested changes Oct 6, 2022

weiminyu previously requested changes Oct 6, 2022

CydeWeys requested changes Oct 6, 2022

jianglai force-pushed the allocate-id branch 5 times, most recently from 8082cc5 to 733fc8a Compare October 17, 2022 15:44

jianglai added the kokoro:force-run Force a Kokoro build. label Oct 17, 2022

domain-registry-eng removed the kokoro:force-run Force a Kokoro build. label Oct 17, 2022

jianglai requested review from CydeWeys and weiminyu October 17, 2022 17:47

jianglai commented Oct 17, 2022

jianglai commented Oct 19, 2022

CydeWeys approved these changes Oct 19, 2022

jianglai added 2 commits October 19, 2022 14:37

WIP, needs comments

b6227b5

jianglai force-pushed the allocate-id branch from 733fc8a to b6227b5 Compare October 19, 2022 18:37

jianglai added 2 commits October 19, 2022 16:01

Fix bad merge

22f411b

More fixes

38c8904

jianglai commented Oct 19, 2022

jianglai enabled auto-merge (squash) October 19, 2022 20:04

jianglai changed the title ~~Does not self allocate IDs in Beam by default.~~ Donot self-allocate IDs in Beam by default. Oct 20, 2022

jianglai merged commit addef17 into google:master Oct 20, 2022

jianglai deleted the allocate-id branch October 20, 2022 00:44
